Mining HIV protease cleavage data using genetic programming with a sum-product function
Abstract
MOTIVATION: Designing effective HIV inhibitors requires understanding the mechanism of HIV protease cleavage specificity. Various methods have been developed to explore this specificity, but extracting discriminant rules while maintaining high prediction accuracy remains challenging. An earlier study successfully employed genetic programming with a min-max scoring function to extract discriminant rules. However, with that function the decision ultimately degenerates to a single residue, which makes further improvement of prediction accuracy difficult. This study was motivated by the challenge of revising the min-max scoring function to improve prediction accuracy.
RESULTS: This paper designs a new scoring function, called the sum-product function, for extracting HIV protease cleavage discriminant rules using genetic programming. Experiments show that the new scoring function is superior to the min-max scoring function.
AVAILABILITY: The software package can be obtained on request from Dr Zheng Rong Yang.
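The abstract does not define either scoring function, so the following is only a minimal sketch of the contrast it describes. It assumes (hypothetically) that a rule is a small AND/OR tree over per-residue scores, that min-max evaluates AND as `min` and OR as `max`, and that sum-product evaluates AND as a product and OR as a sum. The rule encoding, function names, and score values are illustrative and are not taken from the paper or its software package.

```python
from math import prod

# Hypothetical rule encoding: an int is a residue position in the peptide;
# a tuple ("and", ...) or ("or", ...) combines sub-rules.

def evaluate(rule, scores, and_op, or_op):
    """Recursively score a rule tree with the given AND/OR operators."""
    if isinstance(rule, int):
        return scores[rule]
    op, *children = rule
    values = [evaluate(child, scores, and_op, or_op) for child in children]
    return and_op(values) if op == "and" else or_op(values)

def min_max_score(rule, scores):
    # Assumed min-max semantics: AND -> min, OR -> max.
    return evaluate(rule, scores, min, max)

def sum_product_score(rule, scores):
    # Assumed sum-product semantics: AND -> product, OR -> sum.
    return evaluate(rule, scores, prod, sum)

# Illustrative rule: (pos0 AND pos1) OR (pos2 AND pos3), with made-up scores.
rule = ("or", ("and", 0, 1), ("and", 2, 3))
scores = [0.9, 0.4, 0.7, 0.6]

mm = min_max_score(rule, scores)      # always equal to one leaf's score
sp = sum_product_score(rule, scores)  # depends on every leaf's score
print(mm, sp)
```

Under these assumptions, the min-max result is always the score of exactly one residue position, which is one way to read the abstract's remark that the decision degenerates to a single residue. The sum-product result changes whenever any position's score changes.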
Similar resources
A Mathematical Programming Model and Genetic Algorithm for a Multi-Product Single Machine Scheduling Problem with Rework Processes
In this paper, a multi-product single machine scheduling problem with the possibility of producing defective jobs is considered. We consider rework in the scheduling environment and propose a mixed-integer programming (MIP) model for the problem. Based on the philosophy of just-in-time production, minimization of the sum of earliness and tardiness costs is taken into account as the objective fu...
Searching for discrimination rules in protease proteolytic cleavage activity using genetic programming with a min-max scoring function.
This paper presents an algorithm which is able to extract discriminant rules from oligopeptides for protease proteolytic cleavage activity prediction. The algorithm is developed using genetic programming. Three important components in the algorithm are a min-max scoring function, the reverse Polish notation (RPN) and the use of minimum description length. The min-max scoring function is develop...
Mining association rules for HIV-1 protease cleavage site prediction
Several machine learning techniques, such as neural networks, nonlinear support vector machines and decision trees, have been used to model the specificity of HIV-1 protease and to extract specific patterns from peptides cleaved by this protease. Despite many studies, no perfect rules are yet known to determine the cleavage of a peptide by HIV-1 protease. These rules are useful for designing s...
Molecular detection of proteolytic activity of human parechovirus 2A protein by gene expression
Parechoviruses form one of the nine genera in the Picornaviridae family and include two human pathogens: human parechovirus types 1 and 2 (Hpev1 and Hpev2). The genome of picornaviruses encodes a single polyprotein, which undergoes a cleavage cascade performed by virus-encoded proteases to give the final virus proteins. The primary cleavage is performed by the 2A protein, and this step is critical for vi...
A Fast and Self-Repairing Genetic Programming Designer for Logic Circuits
Usually, important parameters in the design and implementation of combinational logic circuits are the number of gates, transistors, and the levels used in the design of the circuit. In this regard, various evolutionary paradigms with different competency have recently been introduced. However, while being advantageous, evolutionary paradigms also have some limitations including: a) lack of con...
Journal: Bioinformatics
Volume 20, Issue 18
Pages: -
Publication date: 2004